Added predictive value of omics data: specific issues related to validation illustrated by two case studies
نویسندگان
چکیده
BACKGROUND In the last years, the importance of independent validation of the prediction ability of a new gene signature has been largely recognized. Recently, with the development of gene signatures which integrate rather than replace the clinical predictors in the prediction rule, the focus has been moved to the validation of the added predictive value of a gene signature, i.e. to the verification that the inclusion of the new gene signature in a prediction model is able to improve its prediction ability. METHODS The high-dimensional nature of the data from which a new signature is derived raises challenging issues and necessitates the modification of classical methods to adapt them to this framework. Here we show how to validate the added predictive value of a signature derived from high-dimensional data and critically discuss the impact of the choice of methods on the results. RESULTS The analysis of the added predictive value of two gene signatures developed in two recent studies on the survival of leukemia patients allows us to illustrate and empirically compare different validation techniques in the high-dimensional framework. CONCLUSIONS The issues related to the high-dimensional nature of the omics predictors space affect the validation process. An analysis procedure based on repeated cross-validation is suggested.
منابع مشابه
On the Simultaneous Analysis of Clinical and Omics Data: A Comparison of Globalboosttest and Pre-validation Techniques
In medical research biostatisticians are often confronted with supervised learning problems involving different kinds of predictors including, e.g., classical clinical predictors and high-dimensional “omics” data. The question of the added predictive value of high-dimensional omics data given that classical predictors are already available has long been under-considered in the biostatistics and...
متن کاملInvestigating the prediction ability of survival models based on both clinical and omics data: two case studies.
In biomedical literature, numerous prediction models for clinical outcomes have been developed based either on clinical data or, more recently, on high-throughput molecular data (omics data). Prediction models based on both types of data, however, are less common, although some recent studies suggest that a suitable combination of clinical and molecular information may lead to models with bette...
متن کاملRisk prediction models: II. External validation, model updating, and impact assessment.
Clinical prediction models are increasingly used to complement clinical reasoning and decision-making in modern medicine, in general, and in the cardiovascular domain, in particular. To these ends, developed models first and foremost need to provide accurate and (internally and externally) validated estimates of probabilities of specific health conditions or outcomes in the targeted individuals...
متن کاملObject-oriented regression for building predictive models with high dimensional omics data from translational studies
Maturing omics technologies enable researchers to generate high dimension omics data (HDOD) routinely in translational clinical studies. In the field of oncology, The Cancer Genome Atlas (TCGA) provided funding support to researchers to generate different types of omics data on a common set of biospecimens with accompanying clinical data and has made the data available for the research communit...
متن کاملValidation of asthma recording in electronic health records: a systematic review
Objective To describe the methods used to validate asthma diagnoses in electronic health records and summarize the results of the validation studies. Background Electronic health records are increasingly being used for research on asthma to inform health services and health policy. Validation of the recording of asthma diagnoses in electronic health records is essential to use these databases...
متن کامل